
[[BP-62] Expose batchReadUnconfirmedAsync to ReadHandle#4739

Merged
merlimat merged 3 commits into apache:master from dao-jun:dev/expose_batchread_unconfirmed
May 7, 2026


@dao-jun
Member

@dao-jun dao-jun commented Apr 6, 2026

This is a supplement to BP-62.

Motivation

batchReadUnconfirmedAsync is not currently exposed on ReadHandle, but downstream projects may depend on the method.

Changes

  1. Expose batchReadUnconfirmedAsync to ReadHandle
  2. Add tests
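
To illustrate what exposing the method on the interface buys a downstream project, here is a minimal, self-contained sketch. Everything in it is a stand-in: `ReadHandleSketch`, `InMemoryLedgerSketch` and `DownstreamReaderSketch` are hypothetical names, the parameter list `(startEntry, maxCount, maxSize)` is assumed to mirror the BP-62 batch read API rather than copied from this PR, and the size policy is a toy one, not BookKeeper's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for ReadHandle; this is NOT the real BookKeeper interface.
interface ReadHandleSketch {
    // Assumed shape: first entry id, maximum entry count, maximum total bytes.
    CompletableFuture<List<byte[]>> batchReadUnconfirmedAsync(long startEntry, int maxCount, long maxSize);
}

// Toy in-memory ledger that implements the interface method.
final class InMemoryLedgerSketch implements ReadHandleSketch {
    private final List<byte[]> entries;

    InMemoryLedgerSketch(List<byte[]> entries) {
        this.entries = new ArrayList<>(entries);
    }

    @Override
    public CompletableFuture<List<byte[]>> batchReadUnconfirmedAsync(long startEntry, int maxCount, long maxSize) {
        List<byte[]> batch = new ArrayList<>();
        long bytes = 0;
        for (long id = startEntry; id < entries.size() && batch.size() < maxCount; id++) {
            byte[] entry = entries.get((int) id);
            // Toy policy: return at least one entry, then stop before exceeding the byte budget.
            if (!batch.isEmpty() && bytes + entry.length > maxSize) {
                break;
            }
            batch.add(entry);
            bytes += entry.length;
        }
        return CompletableFuture.completedFuture(batch);
    }
}

// Downstream code depends only on the interface, not on a concrete handle class.
final class DownstreamReaderSketch {
    static int countBatch(ReadHandleSketch handle, long startEntry) {
        // Up to 10 entries or 8 bytes, whichever limit is hit first.
        return handle.batchReadUnconfirmedAsync(startEntry, 10, 8).join().size();
    }
}
```

With the method only on the concrete class, `DownstreamReaderSketch` would have to downcast its handle; with it on the interface, any implementation can be passed in.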

@dao-jun dao-jun changed the title Expose batchReadUnconfirmedAsync to ReadHandle [[BP-62] Expose batchReadUnconfirmedAsync to ReadHandle Apr 6, 2026
Contributor

@eolivelli eolivelli left a comment


Overall the code change looks good

Please add unit tests for error scenarios (input validation, internal errors, ...).
Errors must be reported accurately in the CompletableFuture.

Also test the fallback case where batch reads are not available
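
The scenarios requested above could be exercised against something shaped like the following sketch. It is not the PR's code: `BatchReadErrorSketch`, its validation rules, and the `"batch"`/`"single"` labels are hypothetical, chosen only to show invalid input surfacing through the returned future (rather than as a thrown exception) and a fallback path when batch reads are unavailable.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class BatchReadErrorSketch {

    // Hypothetical read entry point: never throws, always reports through the future.
    static CompletableFuture<String> batchRead(long startEntry, int maxCount, boolean batchSupported) {
        CompletableFuture<String> result = new CompletableFuture<>();
        if (startEntry < 0 || maxCount <= 0) {
            // Input validation failure is delivered via the future, not thrown to the caller.
            result.completeExceptionally(new IllegalArgumentException(
                    "invalid read: startEntry=" + startEntry + ", maxCount=" + maxCount));
            return result;
        }
        // Fallback case: when batch reads are not available, use the single-entry path.
        result.complete(batchSupported ? "batch" : "single");
        return result;
    }

    // Test helper: returns the failure cause, or null if the future completed normally.
    static Throwable failureOf(CompletableFuture<?> future) {
        try {
            future.join();
            return null;
        } catch (CompletionException e) {
            return e.getCause();
        }
    }
}
```

A test can then assert on the exact exception type carried by the future, for example that `failureOf(batchRead(-1, 10, true))` is an `IllegalArgumentException`, and that `batchRead(0, 10, false)` completes with the fallback label.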

@dao-jun dao-jun closed this Apr 9, 2026
@dao-jun dao-jun reopened this Apr 9, 2026
@dao-jun dao-jun requested a review from eolivelli April 9, 2026 14:41
@dao-jun
Member Author

dao-jun commented Apr 10, 2026

@hangc0276 @lhotari @zymap PTAL

Contributor

@eolivelli eolivelli left a comment


Patch looks good to me, as long as CI passes.
I have restarted the failed jobs.

@void-ptr974
Contributor

#4743

This fixes the flaky test. @dao-jun @eolivelli PTAL

@dao-jun
Member Author

dao-jun commented Apr 21, 2026

/bkbot rerun-failure

@merlimat merlimat merged commit 58c8503 into apache:master May 7, 2026
34 of 36 checks passed
merlimat added a commit that referenced this pull request May 7, 2026
The batchReadUnconfirmedAsync method added in #4739 calls LOG.error(...),
but LedgerHandle was migrated to slog and only has a lowercase `log`
field. Master fails to compile.

Convert the call to the slog builder style used elsewhere in the file.
merlimat added a commit that referenced this pull request May 8, 2026
* Fix LedgerHandle.batchReadUnconfirmedAsync: use slog log instead of LOG

The batchReadUnconfirmedAsync method added in #4739 calls LOG.error(...),
but LedgerHandle was migrated to slog and only has a lowercase `log`
field. Master fails to compile.

Convert the call to the slog builder style used elsewhere in the file.

* Migrate DataFormats and DbLedgerStorageDataFormats from protobuf-java to LightProto

Replace Google's protobuf-java with StreamNative LightProto for the storage and
metadata formats in `bookkeeper-proto`. The wire protocol (`BookkeeperProtocol`)
remains on protobuf-java for now.

LightProto generates mutable, reusable, ByteBuf-aware classes with built-in
proto2 TextFormat (de)serialization (via `generateTextFormat=true`), so the
existing TextFormat-based znode payloads (cookies, auditor votes, lock data,
underreplication entries, layout) round-trip byte-identically.

Notable behavior changes:
- `BookKeeper.DigestType.toProtoDigestType` now returns the LightProto-generated
  enum (same constants, different package).
- v3 ledger metadata uses a hand-rolled length-prefixed delimited writer/reader
  matching protobuf's `writeDelimitedTo`/`mergeDelimitedFrom`.
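
For readers unfamiliar with the delimited framing mentioned above: protobuf's `writeDelimitedTo` emits the message length as a base-128 varint followed by the message bytes. The sketch below shows that framing in plain Java. It is an illustration only, not the writer/reader from the commit; `DelimitedFramingSketch` and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class DelimitedFramingSketch {

    // Writes a varint length prefix (7 bits per byte, high bit = "more bytes follow"), then the payload.
    static void writeDelimited(OutputStream out, byte[] payload) throws IOException {
        int len = payload.length;
        while ((len & ~0x7F) != 0) {
            out.write((len & 0x7F) | 0x80);
            len >>>= 7;
        }
        out.write(len);
        out.write(payload);
    }

    // Reads one varint length prefix and then exactly that many payload bytes.
    static byte[] readDelimited(InputStream in) throws IOException {
        int len = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("truncated length prefix");
            }
            len |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                if (len < 0) {
                    throw new IOException("negative length");
                }
                byte[] payload = new byte[len];
                int off = 0;
                while (off < len) {
                    int n = in.read(payload, off, len - off);
                    if (n < 0) {
                        throw new EOFException("truncated payload");
                    }
                    off += n;
                }
                return payload;
            }
        }
        throw new IOException("malformed varint length prefix");
    }

    // Convenience wrappers over byte arrays, with checked exceptions wrapped.
    static byte[] frame(byte[] payload) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            writeDelimited(out, payload);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte[] unframe(byte[] framed) {
        try {
            return readDelimited(new ByteArrayInputStream(framed));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A 300-byte payload, for instance, gets the two-byte prefix `0xAC 0x02`, so the framed output is 302 bytes long.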

* Add SpotBugs excludes for lightproto-generated classes

The existing exclude `~org.apache.bookkeeper.proto.DataFormats.*` matched
protobuf's nested `DataFormats$LedgerMetadataFormat` etc. LightProto generates
the same messages as flat top-level classes (`LedgerMetadataFormat` directly
in `org.apache.bookkeeper.proto`), so those weren't excluded and triggered
12 bugs in the generated code (bit-twiddling, exposed internal byte arrays,
etc.) that aren't actionable.

Replace the obsolete `DataFormats.*` exclude with explicit per-message
patterns covering both packages and `LightProtoCodec` (also generated
per-package).

* Fix checkstyle: remove redundant same-package import in MockBookies
lhotari pushed a commit that referenced this pull request May 8, 2026
…OG (#4782)

The batchReadUnconfirmedAsync method added in #4739 calls LOG.error(...),
but LedgerHandle was migrated to slog and only has a lowercase `log`
field. Master fails to compile.

Convert the call to the slog builder style used elsewhere in the file.
lhotari pushed a commit that referenced this pull request May 8, 2026
* Fix LedgerHandle.batchReadUnconfirmedAsync: use slog log instead of LOG

The batchReadUnconfirmedAsync method added in #4739 calls LOG.error(...),
but LedgerHandle was migrated to slog and only has a lowercase `log`
field. Master fails to compile.

Convert the call to the slog builder style used elsewhere in the file.

* Migrate stream non-gRPC protos to LightProto

Migrates the stream module's non-gRPC proto definitions to LightProto:
- stream/statelib/src/main/proto/kv.proto (KV state-store commands)
- stream/proto/src/main/proto/cluster.proto (cluster metadata/assignment)

Both files use proto3, including oneof (kv.proto) and map (cluster.proto),
which LightProto handles directly.

For stream/proto, cluster.proto is moved to a parallel proto-lightproto/
directory so the existing protobuf-maven-plugin keeps generating the rest
of the protos (which still depend on the gRPC service stubs). Java sources
that touched the migrated types are updated to use the LightProto
mutable-instance API instead of the protobuf-java builder pattern.

This change is independent of the other in-flight LightProto work in
bookkeeper-proto and bookkeeper-server. gRPC service migration is left
for a follow-up.

* Fix checkstyle: remove unused ServerAssignmentData import
merlimat added a commit that referenced this pull request May 8, 2026
* Fix LedgerHandle.batchReadUnconfirmedAsync: use slog log instead of LOG

The batchReadUnconfirmedAsync method added in #4739 calls LOG.error(...),
but LedgerHandle was migrated to slog and only has a lowercase `log`
field. Master fails to compile.

Convert the call to the slog builder style used elsewhere in the file.

* Migrate DataFormats and DbLedgerStorageDataFormats from protobuf-java to LightProto

Replace Google's protobuf-java with StreamNative LightProto for the storage and
metadata formats in `bookkeeper-proto`. The wire protocol (`BookkeeperProtocol`)
remains on protobuf-java for now.

LightProto generates mutable, reusable, ByteBuf-aware classes with built-in
proto2 TextFormat (de)serialization (via `generateTextFormat=true`), so the
existing TextFormat-based znode payloads (cookies, auditor votes, lock data,
underreplication entries, layout) round-trip byte-identically.

Notable behavior changes:
- `BookKeeper.DigestType.toProtoDigestType` now returns the LightProto-generated
  enum (same constants, different package).
- v3 ledger metadata uses a hand-rolled length-prefixed delimited writer/reader
  matching protobuf's `writeDelimitedTo`/`mergeDelimitedFrom`.

* Add SpotBugs excludes for lightproto-generated classes

The existing exclude `~org.apache.bookkeeper.proto.DataFormats.*` matched
protobuf's nested `DataFormats$LedgerMetadataFormat` etc. LightProto generates
the same messages as flat top-level classes (`LedgerMetadataFormat` directly
in `org.apache.bookkeeper.proto`), so those weren't excluded and triggered
12 bugs in the generated code (bit-twiddling, exposed internal byte arrays,
etc.) that aren't actionable.

Replace the obsolete `DataFormats.*` exclude with explicit per-message
patterns covering both packages and `LightProtoCodec` (also generated
per-package).

* Fix checkstyle: remove redundant same-package import in MockBookies

* Migrate BookkeeperProtocol from protobuf-java to LightProto

Migrates the BookkeeperProtocol.proto wire protocol to use LightProto for
serialization. Combined with the prior migrations of DataFormats and
DbLedgerStorageDataFormats, this drops the protobuf-java runtime dependency
from bookkeeper-proto entirely.

LightProto produces wire-compatible output with protoc for the same .proto,
so on-the-wire bookie/client compatibility is preserved.

Notes on lifecycle handling:
- LightProto messages parsed from a ByteBuf hold lazy references into that
  buffer for field access. The decoders now call materialize() on parsed
  Request/Response/AuthMessage instances so they survive after the source
  buffer is released.
- Server response paths that put entry payloads into ReadLacResponse or
  ReadResponse now copy the bytes via ByteBufUtil.getBytes(...), matching
  the previous ByteString.copyFrom semantics.

Drive-by fix: processWriteLacRequestV3/processReadLacRequestV3 were
ordering work on r.getAddRequest().getLedgerId() instead of the matching
WriteLac/ReadLac request. With protobuf this returned a default 0 for the
unset field; with LightProto it throws IllegalStateException.

* Fix checkstyle: remove redundant same-package imports

* Fix shaded-jar tests after protobuf-java removal

protobuf-java is no longer pulled in transitively by bookkeeper-server,
so the shaded jars no longer contain (shaded) protobuf classes, and the
flat lightproto-generated classes have replaced the BookkeeperProtocol
outer class.

- BookKeeperServerShadedJarTest / DistributedLogCoreShadedJarTest:
  drop the now-irrelevant testProtobufShadedPath checks and switch the
  BookkeeperProtocol presence check to a real lightproto class
  (AddRequest).
- Drop the dead com.google.protobuf:protobuf-java <include> from the
  three shade plugin configs (bookkeeper-server-shaded,
  bookkeeper-server-tests-shaded, distributedlog-core-shaded).

* Use ByteBufList.toByteBuf to avoid copying request bodies

Replace ByteBufList.coalesce(...) with a new ByteBufList#toByteBuf method
on the WriteLac and AddEntry request paths. coalesce allocates a new
buffer and copies all the bytes; toByteBuf wraps the existing buffers in
a CompositeByteBuf (or returns the single buffer directly when the list
has one entry, or Unpooled.EMPTY_BUFFER when empty), transferring
ownership to the caller and releasing the source ByteBufList.

Adds unit tests covering the empty / single / multi-buffer paths,
including ref-count behaviour for both the list and the underlying
buffers.

* Fix ByteBufList.toByteBuf to not release the source list

The previous implementation released the source ByteBufList when
producing the wrapping ByteBuf, but callers in PerChannelBookieClient
only wrap a buffer they don't own; the ByteBufList's lifecycle is
managed by the upstream PendingAddOp, which shares the same list across
multiple bookies in a quorum write. Releasing in toByteBuf produced a
double-release / use-after-free that surfaced as
IllegalReferenceCountException in tests like BookieStickyReadsTest.

Change toByteBuf to leave the source list's ref count untouched and
return a wrapper that holds its own retains (a CompositeByteBuf for
multiple buffers, a retainedDuplicate for the single-buffer fast path,
or Unpooled.EMPTY_BUFFER for empty). This matches the original
ByteBufList.coalesce semantics from the caller's point of view, while
still avoiding the byte copy.

Tests updated to assert the new ownership semantics.

* Don't read required ledgerId/entryId from error responses

Four completion handlers (AddCompletion, ForceLedgerCompletion,
ReadCompletion, WriteLacCompletion) were always reading the inner
*Response's ledgerId/entryId fields, even when the outer Response
carried an error status. On error responses (e.g. EUA from a rejected
SASL handshake) those required fields are not populated. Under
protobuf-java they returned the default 0; under LightProto they throw
IllegalStateException("Field 'ledgerId' is not set"), which surfaced as
GSSAPIBookKeeperTest.testNotAllowedClientId blowing up on the client
side after the server rejected the auth.

Read the inner response only when status is EOK and hasXxxResponse() is
true. Otherwise fall back to the request's ledgerId/entryId, which the
CompletionValue base class already records.
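
The guard described above can be sketched as follows. This is a simplified model, not the actual completion handlers: `CompletionFallbackSketch`, `InnerResponse` and `resolveLedgerId` are hypothetical names, and the status constant is an assumed value used only for illustration. `InnerResponse` imitates the behavior the commit attributes to LightProto, where reading an unset required field throws.

```java
final class CompletionFallbackSketch {
    // Assumed "OK" status value, for illustration only.
    static final int EOK = 0;

    // Hypothetical inner response whose required field throws when it was never set.
    static final class InnerResponse {
        private final Long ledgerId;

        InnerResponse(Long ledgerId) {
            this.ledgerId = ledgerId;
        }

        long getLedgerId() {
            if (ledgerId == null) {
                throw new IllegalStateException("Field 'ledgerId' is not set");
            }
            return ledgerId;
        }
    }

    // Read the inner response only on success; otherwise use the id recorded with the request.
    static long resolveLedgerId(int status, InnerResponse inner, long requestLedgerId) {
        if (status == EOK && inner != null) {
            return inner.getLedgerId();
        }
        return requestLedgerId;
    }
}
```

On an error envelope the inner response is never touched, so an unpopulated `ledgerId` can no longer raise `IllegalStateException` in the handler.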

* Read response ledgerId/entryId on long-poll reads in ReadCompletion

Long-poll reads send entryId=LAST_ADD_CONFIRMED and the bookie fills in
the actual entry id (and ledgerId) on the response when an entry is
returned. The previous fix in ReadCompletion always used the request's
recorded entryId, which made the long-poll path look like an empty
piggy-back response on the client side and dropped the entry buffer.

Read ledgerId/entryId from the response when status is EOK and the
inner ReadResponse is present; fall back to the request's recorded
values only on error envelopes (where the inner response may be
missing or unpopulated). Fixes
TestReadLastConfirmedAndEntry.testRaceOnLastAddConfirmed.

4 participants